Assessment of outcomes and learner progress is a primary concern in
federally funded adult education programs. This concern is not new, but 
it has gained prominence over the past decade as legislative imperatives, 
such as the National Literacy Act of 1991 and the Government 
Performance and Results Act of 1993, have required federally funded 
programs to be more accountable for what they do. The lack of a 
consistent assessment system across states and across programs within 
states has impeded the documentation and reporting of results to state 
and federal stakeholders. Such a system is needed to demonstrate the 
difference that adult education makes in the lives of learners, the 
communities in which they live, and the nation as a whole. Welfare 
reform and the establishment of one-stop centers for education and 
training underscore the need for better and more compatible 
accountability systems across and within states (Short, 1997). 

The Workforce Investment Act (WIA) of 1998 called for the 
establishment of "a comprehensive performance accountability system to 
assess the effectiveness of eligible agencies in achieving continuous 
improvement of adult education and literacy activities ... in order to 
optimize the return on investment of Federal funds in adult education and 
literacy activities" (WIA, section 212.a). States now award adult
education funding to programs that provide adult education services 
based on twelve criteria, which include the degree to which the program 
establishes performance measures for learner outcomes, past 
effectiveness in meeting (or even exceeding) these performance 
measures, and the maintenance of a high-quality information 
management system that can report participant outcomes and monitor 
program performance against the performance measures (WIA, section 
231.e). Adult education programs that offer English for speakers of other 
languages (ESOL) have much at stake in the movement to define and
measure learner outcomes, for adult ESOL instruction is the fastest
growing area in federally funded adult education programs in the United 
States (National Center for Educational Statistics, 1997). 

Adult ESOL programs have always grappled with how to measure and 
report a range of desired outcomes and satisfy the demands of each 
stakeholder: learners, teachers, program administrators, funding agencies 
and organizations, policymakers, and the general public. Learners want 
to know how well they are progressing in learning English. Teachers 
want feedback on the effectiveness of their instruction. Program 
administrators want to know how well they are meeting program goals 
and how they can improve their services. Those funding the programs as 
well as the general public want to know whether funds spent are yielding 
results. Policymakers want to know what specific practices are successful 
so they can establish guidelines for allocating future funds. A single 
approach to assessment may not provide enough useful information to 
satisfy each of these demands. 

A comparison with reports on testing and assessment from the early 1990s
(Business Council for Effective Literacy, 1990; Sticht, 1990) shows that
very few of these concerns about assessment have been resolved. The tests
being used in adult education, namely the TABE (Test of Adult Basic
Education), ABLE (Adult Basic Learning Examination), and CASAS
(Comprehensive Adult Student Assessment System), and in adult ESOL,
namely the BEST (Basic English Skills Test) and CASAS, are largely the
same, as are the critiques of their validity and reliability. The call for
research to help answer questions
about the role and use of standardized tests and other assessments is still 
appropriate, and even the questions of what should be assessed and for 
what purpose are still being debated. However, the field has at least
made progress on the following two issues. First, it is generally
acknowledged that tests developed for native English speakers are not 
appropriate for use with English-language learners. Second, certain 
segments of the field have recognized that assessment is but one 
component in a larger instructional system that includes standards for 
content, program design, staff development, and assessment. 

This chapter seeks to provide the field of adult ESOL education, at the
local, state, and national levels, with a timely overview of the state of
assessment in adult ESOL programs in the United States. It also seeks to
provide a brief description of assessment reform initiatives in K-12
education and in adult language
education abroad that might serve as models for adult education in the 
United States. Our intent is to help program staff and state and national 
policymakers make informed choices about assessment measures and 
procedures and to foster a collaborative effort to build an accountability 
system that addresses the needs of each stakeholder in a more effective
adult education instructional system. Many decisions about assessment
were being made even as we were writing this chapter. Although 
legislative requirements demand that states have an accountability system 
in place by July 1999, the field will be debating assessment issues for a 
long time to come. 

For the purposes of this report, we are using the term assessment in a 
broad sense: to find out what learners want, know, and can do at the 
beginning of instruction (needs, placement, and diagnosis), throughout 
instruction (ongoing progress), and at the end of instruction (achievement 
and outcomes). The information presented is based on a literature review 
from the fields of language assessment and government policy, reviews 
of standardized tests currently being used in adult ESOL programs, and 
discussions with experienced adult ESOL educators. We explore three 
key issues and make recommendations based on the findings: 

■ What implications do legislative requirements for performance 
measures have for adult ESOL programs? 

■ What assessment tools and processes are available and how 
adequately do they meet the needs of all stakeholders? 

■ What insights can be gained from the assessment reform 
experience of K-12 education and adult language education
abroad? 



PERFORMANCE MEASURES IN ADULT EDUCATION 

The use of standardized tests to evaluate adult education programs was 
put into legislation for the first time in 1988, in amendments to the Adult 
Education Act (Business Council for Effective Literacy, 1990). In that 
legislation, states were required to evaluate the progress of at least
one-third of their grant recipients using standardized tests (Sticht, 1990). The
National Literacy Act of 1991, amending the Adult Education Act, 
required the U.S. Department of Education (ED) to develop indicators of 
program quality that would assist states and local programs in judging 
the effectiveness of programs that provide adult education services. The 
legislation specifically called for indicators in the areas of recruitment, 
retention, and educational gains. 

The Department of Education sought input from the field of adult 
education by reviewing state and local practices related to program 
quality, commissioning papers by experts in the field, holding focus 
groups, and working closely with the state directors of adult education 
(Office of Vocational and Adult Education, 1992). A quality program 
indicator was defined as a variable reflecting effective and efficient 
program performance. It was distinguished from a measure (data used to
determine the level of performance) and a performance standard (the
level of acceptable performance in terms of a specific numeric criterion). 

Under the area of educational gains, two indicators were identified: 

■ Learners demonstrate progress toward attainment of basic skills 
and competencies that support their educational needs. 

■ Learners advance in the instructional programs or complete 
program education requirements that allow them to continue their 
education or training. 

Sample measures for the first indicator included standardized test scores, 
competency-based test scores, teacher reports of improvements in 
communication competencies, and demonstrated improvement on 
alternative assessments (such as portfolios, checklists of specific 
employability or life skills, and student reports of attainment). Sample 
measures for the second indicator included rate of student advancement 
to a higher level of skill or competency in the program; attainment of a 
competency certificate, General Educational Development credential 
(GED), or high school diploma; and percentage of students referred to or 
entering other education or training programs. 

With the passage of the Government Performance and Results Act 
(GPRA) in 1993, more emphasis was placed on performance 
measurement as a requirement of government-funded program 
evaluations. Now, under the Workforce Investment Act, each state must
negotiate with the ED acceptable target levels of performance on three core
indicators that encompass the quality indicators identified as a result
of the earlier legislation: (1) demonstrated improvement in skill levels in
reading, writing, and speaking the English language, numeracy, problem 
solving, English-language acquisition, and other literacy skills; (2) 
placement in, retention in, or completion of postsecondary education, 
training, unsubsidized employment, or career advancement; and (3) 
receipt of a secondary school diploma or its recognized equivalent (WIA, 
section 212.b.2.A). The levels of performance for each core indicator are 
to be expressed in objective, quantifiable, and measurable form and must 
show the progress of each eligible program toward continual 
improvement of learner performance. 

Within its five-year plan for adult education, each state will establish
levels of performance for programs to meet. States are in the process of 
preparing their plans for ED approval. The plans take effect on July 1,
1999. Each year, states will submit data on the core indicators to the 
secretary of education, who will issue reports on how each state is doing. 

To facilitate the accountability and reporting process, the ED has been
working with the state directors of adult education to establish a National
Reporting System (NRS). The ED has granted funding to the American
Institutes for Research/Pelavin Research Center (AIR/Pelavin) to help
establish the system. The NRS will include a common
set of outcome measures; a system for collecting data on these measures; 
and standard guidelines, definitions, and forms for reporting the data 
(Office of Vocational and Adult Education, 1997). 

The NRS draft outcome measures for adult English-language learners are 
"educational functioning-level descriptors" that describe what a learner 
knows and can do in three areas: speaking and listening, reading and 
writing, and functional and workplace skills. These functioning-level 
descriptors appear to combine features outlined in the CASAS level
descriptors and the Student Performance Level (SPL) descriptors. States
are to use the functioning levels to report educational gains of learners in 
the programs they fund. 

Programs will determine an individual learner's entry-level and 
subsequent-level gains using a uniform, standardized assessment 
procedure that has been described in the state plan and approved by the 
ED (Office of Vocational and Adult Education, 1998). Illustrative 
examples of test benchmarks for each functioning level in the pilot NRS 
document include a range of CASAS Life Skills scores and SPLs. (The 
SPLs, developed under the auspices of the Office of Refugee 
Resettlement's Mainstream English Language Training Project [MELT], 
U.S. Department of Health and Human Services, 1985, are also 
descriptions of adult learners' language abilities. They are correlated to 
the BEST test; both the BEST and CASAS tests are reviewed below.) 
Unfortunately, the two test benchmark guidelines provided in the pilot 
draft do not cover the range of measures previously identified for the 
quality indicator of educational gains, nor do they allow for the 
flexibility that local programs may need. More examples need to be 
identified during the pilot field test and added to the final document. If 
they are not, there is a danger that in the need to satisfy the demands of 
policymakers and funding sources, assessment will become too narrowly 
focused on standardized test scores. This may lead to program designs
that do not serve the needs of learners or the communities in which they
live or that fail to adequately assess what learners know and can do. The
fundamental question of what is to be counted as success, and therefore
what skills and proficiencies are assessed, needs to be addressed at the
program, state, and national levels before the NRS is finalized.



LANGUAGE AND LITERACY ASSESSMENT IN ADULT ESOL 

The Adult Education and Family Literacy Act (Title II of the WIA) 
defines literacy as "an individual's ability to read, write, and speak in
English, compute, and solve problems at levels of proficiency necessary
to function on the job, in the family of the individual, and in society." 
However, there have been many definitions of literacy over the years, 
including a school-based view of literacy as basic reading and writing 
skills, the functional view that is in the legislation, and a view of literacy 
as social practices. Even this last definition is acknowledged in the
"Family Literacy" designation noted in Title II of the WIA. The field of 
adult ESOL recognizes that there are many literacies, defined by how 
individuals use literacy in everyday life to achieve personal, family, job, 
and community participation goals (Crandall, 1992). Literacy includes 
the ability to complete a task or solve a problem, such as getting a 
driver's license, completing the GED, or finding a job; to support the 
learning of one's children; to comprehend print material (in one's first or 
second language); and more. 

Which literacy is to be assessed: development in reading and writing, 
speaking, mathematical ability, social practice, or all of these? What 
constitutes progress in these areas, and how is it assessed? Can a gain in 
general language proficiency on a given measure be considered 
sufficient, or is a variety of assessment instruments and processes 
needed? These are questions that must be answered and agreed on at the 
local, state, and national levels if we are to establish an accountability 
system that captures learner progress. 

Many adult ESOL programs use a combination of assessment tools to 
meet their program needs. These include standardized tests such as the 
CASAS and BEST, materials-based tests such as those accompanying 
text series, and program-based tools such as teacher-made tests and 
portfolios. However, the field of adult ESOL lacks a cohesive assessment 
system that enables comparison of learner achievement and program 
impact across the wide variety of programs (survival, preemployment, 
preacademic, workplace, vocational ESOL, ESOL for citizenship, ESOL 
family literacy) and the wide range of delivery systems (local education 
agencies, community colleges, libraries, community-based or volunteer 
organizations, churches, businesses, and unions). The lack of consistent 
assessment procedures from program to program is problematic for two 
major reasons. One is that it impedes the documentation and reporting of 
results to the satisfaction of all stakeholders. The other is that it 
frequently impedes the movement of learners from ESOL to vocational 
training and academic programs because learner pathways differ from 
program to program. 

Most programs use placement procedures to match students to the levels 
or courses offered. This may take the form of a standardized test, a 
program-developed test or interview, or a combination of these. The 
kinds of assessments that are used after placement depend to a large
extent on the program's philosophy of language and learning, the roles of
teachers and learners, and the measures of success as defined by the 
various stakeholders (Wrigley, 1992). Program staff need to juggle two 
important purposes for assessment: 

■ They need to assess and document the actual progress that learners 
are making toward English-language development and completion 
of learner goals. 

■ They must meet the legislative requirements of the WIA, which 
requires a standardized assessment procedure and performance 
measures, and the NRS, which links learner progress to proficiency 
descriptors. 

Therefore, they must select instruments and procedures carefully and, in 
many cases, use a combination of standardized and alternative 
assessments. 

Standardized Assessment Tools in Adult ESOL 

For the purpose of this chapter, a standardized assessment tool is one that 
has been developed according to explicit specifications, has items that 
have been tested and selected for item difficulty and discriminating 
power, is administered and scored according to uniform directions, and 
has dependable norms for interpreting scores (Ebel, 1979). Standardized 
tests are used in adult education programs in most states because they are 
easy to administer to groups, require minimal training for the teacher, 
and purport to have construct validity and scoring reliability (Solorzano, 
1994; Wrigley, 1992). 

Standardized tests reviewed for this chapter include the Adult Basic 
Learning Examination, the Test of Adult Basic Education, the Adult 
Language Assessment Scales (A-LAS), the Comprehensive Adult 
Student Assessment System (components appropriate for ESOL), New 
York State's Placement Test for English as a Second Language Adult 
Students (NYS Place), and the Basic English Skills Test. Because they 
were designed for native English speakers, the ABLE and TABE are now 
regarded as inappropriate for English-language learners. We have 
included them here because they were used in the 1970s and 1980s in 
many adult education programs that had ESOL learners in classes with 
adult basic education (ABE) learners, and they are still used in some of 
these programs today. The A-LAS parallels the ABLE and TABE 
in that it has both language and mathematics batteries, but it was 
designed for nonnative English speakers. The CASAS and BEST are the 
two most widely used standardized tests in adult ESOL programs. Both 
were developed for assessing nonnative English speakers. The NYS 
Place was selected because it is the only oral assessment besides the
BEST that was identified by the California Department of Education
Adult ESL Assessment Project (Kahn, Butler, Weigle, & Sato, 1995) as 
suitable as a placement tool for assessing speaking ability 
in programs that were implementing the California ESL Model Standards 
(California Department of Education, 1992), discussed below. Ordering 
information for these tests is included in the chapter appendix. 

ADULT BASIC LEARNING EXAMINATION (ABLE). The ABLE
was designed for use with native English-speaking adults who have
limited formal education and is used primarily in ABE and GED
programs and in prison education. It is an educational achievement test.
It was not designed to be a language development test or even a test of
language proficiency, although it does have a language subtest in Levels 
2 and 3. The ABLE is available in three levels differentiated by years of 
formal schooling, and it has six subtests: vocabulary, reading 
comprehension, spelling, language (Levels 2 and 3 only), number 
operations, and problem solving. The first four of these subtests relate to 
language proficiency and are described in Exhibit 6.1. Reviews of the 
full battery can be found in Fitzpatrick (1992) and Williams (1992). 

Although the ABLE is represented by its publisher, the Texas-based 
Psychological Corporation, to be an indicator of educational 
achievement, many of the items on it reflect a narrow concept of 
achievement. For example, there is a heavy preoccupation with the 
inflectional morphology of auxiliary verbs (for example, We are/was 
[verb]), which is highly differentiated across social groups. At the same
time, there is a complete absence of attention to such language systems as
complex nominals (for example, the combination of [noun] with [noun]),
which continue to develop in adolescence and later across all groups.

The use of the ABLE with the populations for which it was designed
(ABE, GED, and prison education) is problematic, and transporting the
test to the adult ESOL population compounds these problems. The test
does not reflect what is known about stages or sequencing of English 
language development. Only Level 1 would be plausible for use in most 
adult ESOL programs. Furthermore, because the vocabulary test is 
presented orally and the writing is confined to words in isolation (a 
spelling test), the ABLE has very few items that actually measure literacy 
skills. 

TEST OF ADULT BASIC EDUCATION (TABE). The developers of 
the TABE made a conscious attempt to assess the basic skills taught in 
adult basic education programs. The publisher, CTB/McGraw-Hill in 
California, reports a systematic effort to limit cultural, gender, and ethnic 
bias; construct items with content appropriate for adults; and include
items developed through item response theory (IRT) modeling (see
Hambleton, 1991), with desirable psychometric properties such as item 
discrimination and range of difficulty. That effort has resulted in content 
that is generally accessible to immigrant adults. 

The TABE has three basic levels: E (easy, grade equivalents 1.6-3.9), M
(medium, grades 3.6-6.9), and D (difficult, grades 6.6-8.9), with an
upward extension to A (advanced, grades 8.6-14.9). There are also a
downward extension, L (literacy), a Spanish-language version, a 
computer-based version, a placement test, and several other associated 
products. Levels E through A consist of two mathematics tests 
(computation and applied mathematics), a reading test, a language test, 
and an optional spelling test. There is no writing test. The language tests 
are reviewed in Exhibit 6.2. 

The TABE is not suitable for learners in beginning-level adult ESOL 
classes. If given the locator test, many ESOL learners will be assigned to 
take the Level L test. That test, with its high proportion of literacy 
readiness items (twenty-three), may be useful for testing adults with little 
or no previous alphabetic literacy experience, but it is not appropriate for 
beginning English learners who are literate in their native language. The 
remaining twenty-seven items of the test, with too few items at the lower 
range of language development, will not detect the learning that can 
reasonably be expected in early levels of ESOL instruction. However, for 
the most advanced levels of adult ESOL instruction, the TABE may 
prove useful. 

ADULT LANGUAGE ASSESSMENT SCALES (A-LAS). The A-LAS, 
published in New York by McGraw-Hill, is designed to test the English- 
language skills needed for "entry level functioning in a mainstream 
academic or employment environment" (Duncan & DeAvila, 1993, p. 1). 
The A-LAS consists of two test batteries: a set of oral tests and tests of 
reading, writing, and mathematics, available in two forms. The reading 
and writing tests are described in Exhibit 6.3. 

The A-LAS was constructed for testing the language and literacy of 
adults learning English. That it was not adapted from another test is 
apparent in the vocabulary employed (carefully selected for English 
learners) and the adult level of the item content (for example, concerning 
employment). None of the items, however, seeks to differentiate stages 
of English-language development. Because the A-LAS attempts to test 
the full range of English skills, from no English to entry-level 
functioning for employment and academic work, it must contain items 
that range in difficulty from monosyllabic word recognition to essay 
construction. However, the reading test has relatively few items at any 
particular level of development. This may be sufficient for a placement
instrument, but if the test is used for assessing achievement, it will be difficult to
detect the increment of learning that many adults display in the relatively 
short time they stay in programs. The A-LAS would be a much stronger 
tool for assessing achievement if it were available in multiple levels, 
allowing more items at each difficulty level. 

COMPREHENSIVE ADULT STUDENT ASSESSMENT SYSTEM 
(CASAS). The CASAS Web site (http://www.casas.org) lists the 
availability of more than one hundred standardized assessments and a 
variety of instructional and supporting materials developed by CASAS. 
The system was designed for adult basic education, workforce learning, 
special education, adult ESOL, and various other state and federal 
programs. CASAS began in the early 1980s as a collaborative effort 
between adult educators in California and the State Department of 
Education. Their goal was to develop an assessment model that would 
help adult education programs to implement competency-based 
education, as mandated by the 1982 California state plan for adult 
education (Center for Adult Education, 1983). Over the years, the system 
developers have identified more than three hundred competencies (that 
is, statements that describe adult functioning in employment and in 
society). They then developed and field-tested more than four thousand 
life skills reading items that assess those competencies. These items are 
the basis for the array of assessments now available from CASAS. 
Extensive training is required as part of the CASAS purchase agreement 
to ensure proper administration and use of the assessments. See Exhibit 
6.4 for a descriptive review of the assessments for ESOL populations, 
particularly the series of tests characterized as life skills. 

CASAS tests multiple modalities at multiple levels with multiple forms. 
Some of the tests are specifically constructed for the adult English- 
language learner population. In general, the item content is accessible for 
immigrant populations. All of the items are tied to a list of competencies, 
and all of the tests are scaled to a single, uniform scale of proficiency. 
Ranges of the scale are associated with highly generalized statements of 
language proficiency or skill levels. 

The CASAS system provides the instructor with a chart to construct an
item-by-item, student-by-student summary, or class profile, for some of
the assessments. The student-by-student class profile is keyed to
corresponding CASAS competencies rather than directly to test items.
Constructing such class snapshots of student performance allows 
teachers to see where students are performing well and where they need 
continued instruction. However, one might hesitate to provide instructors 
with such information because they might limit instruction to the content 
of the test, and this might pose a problem with the relationship between 
parallel forms of the tests. At any level, the items are selected not only to
be of comparable difficulty across forms but to have a very high overlap
of the competencies they represent. For example, at Level A, Life Skills
Listening, twenty-six of the thirty-four items are drawn from
competencies shared across the two forms (form 51 and form 52). Many 
of the competencies are very narrowly defined (for example, "Interpret 
clothing and pattern size"). When the class profile shows that many 
students did not get the item for a particular competency correct, there 
will be a tendency to focus instruction in that area. When that same 
competency is retested on the parallel posttest form, improvement could 
be expected, but this does not mean that there would be similar growth in 
all comparable competency areas. 
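
The class-profile concern can be made concrete with a small sketch. The Python below is illustrative only: the student labels, the item-to-competency key, and the "Read a bus schedule" competency are hypothetical and are not taken from CASAS materials. It tallies the share of correct answers per competency, as a class profile might, and computes the Level A overlap proportion from the figures given above.

```python
from collections import defaultdict

# Hypothetical item-to-competency key; real CASAS keys are not reproduced here.
ITEM_COMPETENCY = {
    1: "Interpret clothing and pattern size",
    2: "Interpret clothing and pattern size",
    3: "Read a bus schedule",  # hypothetical competency label
}

# Hypothetical responses: student -> {item number: answered correctly?}
RESPONSES = {
    "Student A": {1: False, 2: False, 3: True},
    "Student B": {1: True, 2: False, 3: True},
}


def class_profile(responses, item_competency):
    """Return the fraction of correct answers for each competency."""
    correct = defaultdict(int)
    attempted = defaultdict(int)
    for answers in responses.values():
        for item, is_correct in answers.items():
            competency = item_competency[item]
            attempted[competency] += 1
            correct[competency] += int(is_correct)
    return {c: correct[c] / attempted[c] for c in attempted}


def overlap_share(shared_items, total_items):
    """Proportion of a form's items drawn from competencies shared with the parallel form."""
    return shared_items / total_items


profile = class_profile(RESPONSES, ITEM_COMPETENCY)
level_a_overlap = overlap_share(26, 34)  # twenty-six of thirty-four items, per the text
```

In this made-up class, the clothing competency shows 25 percent correct, which is the kind of result that would draw instruction toward that area. The Level A overlap works out to roughly 76 percent, so a gain on the parallel posttest may reflect that targeted teaching rather than broad growth.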

Another problem could develop if the instruction and testing are drawn 
into narrow content domains or competencies while the test results are 
being interpreted in very broad proficiency ranges. Because the posttests 
do not sample broadly across the competencies but rather concentrate on 
the taught areas, they will overstate proficiency when it is then 
interpreted in terms of broad skill levels. (For example, the test results 
may state that a person is low intermediate ESL, meaning she can satisfy 
basic survival needs and very routine social demands and can understand 
simple learned phrases easily. What the results do not state is that she 
may have this competency only in the area of clothing.) 

BASIC ENGLISH SKILLS TEST (BEST). The BEST, published by the 
Center for Applied Linguistics in Washington, D.C., is designed as an 
adult English-language proficiency test, focusing primarily on survival 
and preemployment language skills. It consists of two parts: oral 
interview and literacy skills. 

Like the CASAS assessments, the BEST was an outgrowth of the 
movement toward a competency-based approach to instruction for adult 
English-language learners in nonacademic programs. The development 
of the BEST was funded principally by the Office of Refugee 
Resettlement (ORR) of the U.S. Department of Health and Human 
Services (HHS). Teachers and administrators from ORR's Region 1 in 
Boston and the National Office in Washington, D.C., worked with test 
developers from the Center for Applied Linguistics (CAL) to develop the 
original 1982 version (form A). Three additional forms (B, C, and D) 
were developed and field-tested in 1984 with the help of staff from seven 
geographically diverse programs that were participating in the ORR 
Mainstream English Language Training (MELT) Project (National 
Clearinghouse for ESL Literacy Education, 1989). 

The primary goals of the MELT Project were to provide consistency 
among ORR-funded programs in the United States, continuity between 
the domestic and overseas training programs (mostly in Southeast Asia),
and guidance for curriculum development, establishment of instructional
levels, and assessment. Products other than the BEST included the 
Student Performance Levels (SPLs) (cited as test benchmarks in the 
NRS) and a core curriculum document that correlated topics and 
competencies to the SPLs (U.S. HHS, 1985). Although the BEST was 
primarily developed for use with English-language learners from 
Southeast Asia, many of the programs participating in the field test of the 
BEST also provided other refugee and immigrant populations with 
services and included those populations in the field test (Allene Grognet, 
director, BEST Development Project, personal communication, January 
14, 1999). However, the test population data reported in the MELT
documents include only refugee populations.

The BEST was made available to ORR-funded programs from the ORR 
Refugee Materials Center in Kansas City, Missouri, until it closed at the 
end of 1987. At that time, CAL decided to reprint form B and make it 
available through CAL. Form C was eventually reprinted as well. The 
form B oral interview section is described in Exhibit 6.5. 

Using only forty-nine items, the BEST oral interview attempts to assess 
language proficiency from eight topic areas or domains, across eight 
proficiency levels, using the four response modalities of speaking, 
listening, reading, and writing. The result is that very few items are 
actually related to the theoretical model at each proficiency level. The 
length of the test and the number of items are constrained by the need to 
administer each interview individually. The consequence of this time 
constraint and the desire to develop a broadly defined scale of 
performance levels in a single test is that the test loses stability when 
used to predict the exact proficiency level of individual students. In fact, 
the test developers recognized that the BEST discriminates better at the 
lower SPLs (0–VI) than the higher ones (VII–X). In 1992, CAL 
convened a meeting with potential users of a higher-level BEST to 
explore the general design and preliminary specifications of a test that 
would discriminate at the higher levels (CAL, 1992, summary of the 
higher-level BEST test meeting). Lack of funding prevented the test 
development from proceeding. The BEST, however, does elicit extended 
responses for fluency questions so that the proficiency level of the 
learners can be probed more deeply than is allowed for in the NYS Place. 
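
The coverage problem noted above for the BEST oral interview can be made 
concrete with some simple arithmetic. The sketch below uses only the counts 
given in the text (forty-nine items, eight topic areas, eight proficiency 
levels, four modalities). It assumes, purely for illustration, that items 
are spread evenly across the design; the actual allocation of BEST items is 
not described here. 

```python
# Counts taken from the description of the BEST oral interview.
ITEMS = 49
TOPIC_AREAS = 8
PROFICIENCY_LEVELS = 8
MODALITIES = 4  # speaking, listening, reading, writing

# Illustrative assumption: items are spread evenly over the design.
cells = TOPIC_AREAS * PROFICIENCY_LEVELS * MODALITIES
items_per_level = ITEMS / PROFICIENCY_LEVELS
items_per_cell = ITEMS / cells

print(f"topic x level x modality cells: {cells}")
print(f"average items per proficiency level: {items_per_level:.1f}")
print(f"average items per cell: {items_per_cell:.2f}")
```

Even under this generous even-spread assumption, each proficiency level is 
represented by only a handful of items, and most combinations of topic, 
level, and modality cannot be represented at all. 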

NEW YORK STATE PLACEMENT TEST FOR ENGLISH AS A 
SECOND LANGUAGE ADULT STUDENTS (NYS PLACE). The NYS 
Place is primarily an oral picture description task, cued by brief oral 
questions from the examiner. The test has an optional initial oral 
screening component, or oral warm-up, to determine if the examinee has 
sufficient English to proceed with the test. These seven items are simple 
greetings and directives. If the examinee fails to answer three items in a 
row, testing is suspended. There is also a literacy screen with items 
asking the test taker to read letters, numbers, and words in isolation as 
well as one short sentence. Exhibit 6.6 presents a description of Form B, 
the only form currently available. 
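
The suspension rule for the oral warm-up can be expressed as a short 
procedure. The following is a minimal sketch, not the scoring procedure 
from the test manual: the function name and the representation of 
responses as True (answered) or False (not answered) are hypothetical, 
and only the three-in-a-row rule comes from the description above. 

```python
def oral_warmup_continues(responses, max_consecutive_misses=3):
    """Return True if testing proceeds, False if it is suspended.

    `responses` holds one boolean per warm-up item, in order:
    True if the examinee answered, False if not (hypothetical encoding).
    """
    misses_in_a_row = 0
    for answered in responses:
        if answered:
            # Any answered item resets the run of consecutive misses.
            misses_in_a_row = 0
        else:
            misses_in_a_row += 1
            if misses_in_a_row >= max_consecutive_misses:
                return False
    return True


# Four misses in total, but never three in a row: testing proceeds.
print(oral_warmup_continues([False, False, True, False, False, True, True]))

# Three consecutive misses: testing is suspended.
print(oral_warmup_continues([True, False, False, False, True, True, True]))
```

The sketch highlights that the rule depends on consecutive failures, not 
on the total number of unanswered items. 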

The NYS Place, developed by the State Education Department of New 
York, is available from the City School District of Albany. It was 
developed specifically for the initial placement of adults in ESOL 
programs, and the content is generally accessible to immigrant 
populations. The NYS Place, along with the BEST, is one of the few 
tests of spoken English for adults. It is a highly structured test. Unlike the 
BEST, it does not give the examinee an opportunity to initiate or 
elaborate at any time. Therefore, the test protocol cannot be represented 
as eliciting authentic conversation. 

GENERAL OBSERVATIONS ON TESTS. There are probably as many 
definitions of language proficiency as there are programs. Because 
language has so many facets and so many uses, different tests approach 
different aspects of language proficiency. Over the years, proficiency 
testing has reflected changes in our understanding of language theory. It 
has moved from a structural view (for example, discrete point tests of 
grammar, phonology, and other components of language), through a 
sociolinguistic view (integrative tests such as cloze and dictation), to a 
communicative view (for example, oral proficiency interviews that assess 
the learner's ability to use language to carry out communicative tasks) 
(Manidis & Prescott, 1996). Today, given the focus on real-life, practical 
content in adult ESOL instruction and the goals of the learners, if a test 
does not in some way look at language as communication, it would seem 
to be missing much that is important. 

Nevertheless, most items in most tests do not relate directly to either 
theoretically or empirically derived understandings of adult English- 
language proficiency development. One might assume that if a test is 
constructed in English and requires responses in English, then higher 
scores will correspond to higher levels of English proficiency. But this is 
a very shaky foundation on which to build proficiency assessment. To 
the extent that the content of the items is known to and within the 
experience of all examinees, score differentials will more directly reflect 
language proficiency differentials. 

A number of the tests designed for English-language learners reviewed 
here relate scores to some broadly defined scale of proficiency levels. 
These proficiency levels tend to be described in very global terms, often 
corresponding to the complexity of communicative situations (for 
example, greetings and leave-takings contrasted to explanation and 
persuasion). The actual items in a test, however, may be particular to a 
competency area and may sample very narrowly from the broad ranges 
of behavior described by the proficiency levels. The generalizability from 
performance on the test items back to all situations consistent with a 
particular level description may be difficult to establish. When there is 
only one form of a test or a very small set of alternate forms, the 
possibility increases that learners begin to learn the test or that teachers 
teach to the test. 

The use of a single test form to assess the full spectrum of proficiency 
levels also means that most items will not match any particular test 
taker's current level of functioning; that is, there are too few items at any 
one level of proficiency. Tests with multiple levels usually make more 
accurate assessments of functioning level. This can also be achieved in 
direct measures of proficiency, such as oral interviews, if the items 
engage the examinee in extended dialogue so that proficiency is assessed 
based on learner response and not the difficulty of the question. 

Alternative Assessment in Adult ESOL 

In the reviews of standardized tests, we pointed out the difficulty of using 
a single assessment instrument to provide information useful for 
placement, instructional decisions, and accountability. This fact has led 
many adult ESOL program staff to develop program-based alternative 
assessments. Alternative assessment is "any method of finding out what a 
learner knows or can do, that is intended to show growth and inform 
instruction, and is not a standardized or traditional test" (Valdez-Pierce & 
O'Malley, 1992, p. 1). 

Alternative assessments may take the form of performance assessment, 
portfolio assessment, or learner self-assessment. In performance 
assessment, the learner uses prior knowledge and recent learning to 
accomplish a task related to general language use or relevant to a specific 
context (Lumley, 1996). The learner response to, or outcome of, 
performance assessment may be an oral or written report, an individual 
or group project, an exhibition, or a demonstration (O'Malley & Pierce, 
1996). A portfolio is a systematic collection of learner work that 
represents progress and achievement in more than one area (Fingeret, 
1993). In learner self-assessment the learners monitor their own progress 
and accomplishments in order to select learning tasks and plan their use 
of time and resources to accomplish those tasks (O'Malley & Pierce, 
1996). 

Alternative assessments are consistent with emerging models of language 
acquisition. These models examine how people acquire language 
competence as they use the language in social interaction to accomplish 
purposeful tasks, such as to give and receive information and to make 
requests (August & Hakuta, 1997). 




During the past decade, several studies and publications on best practices 
have guided the development of alternative assessments. The most 
comprehensive that was written specifically for adult ESOL is Bringing 
Literacy to Life, a document prepared for the U.S. Department of 
Education by Aguirre International (Wrigley & Guth, 1992). Other 
publications include the following: 

■ Assessing Success in Family Literacy Projects: Alternative 
Approaches to Assessment and Evaluation (Holt, 1994) 

■ Adventures in Assessment (McGrail & Simmons, 1991–1998) 

■ Making Meaning, Making Change: Participatory Curriculum 
Development for Adult ESL Literacy (Auerbach, 1992) 

■ It Belongs to Me: A Guide for Portfolio Assessment in Adult 
Education Programs (Fingeret, 1993) 

■ Authentic Assessment for English Language Learners: Practical 
Approaches for Teachers (O'Malley & Pierce, 1996) 

The last book, though focused on K-12 education, can be helpful to adult 
ESOL programs as well. It devotes considerable attention to assisting 
teachers in developing assessment tasks and scoring rubrics (that is, 
rating scales) that ensure the reliability and validity of alternative 
assessments. Examples of assessment tools and processes advocated in 
these materials include learner-teacher conferences; questionnaires and 
surveys; teacher observation forms; checklists of communication skills 
and behaviors (Crandall & Peyton, 1993); and learner reading, writing, 
and speaking logs. Learners might prepare narrative writings or keep 
journals in which they express what they have learned in class; what 
changes they have made in their language and literacy practices or 
interactions; and how their goals, needs, and interests have been met or 
modified (Auerbach, 1992). 

Although these alternative assessments are not standardized, they should 
be consistent with the following principles: 

■ They are program based, reflecting the program's underlying 
philosophy of instruction. 

■ They are learner centered, reflecting the strengths and goals of 
individual learners. 

■ They are done with the learner, not to the learner, so that learners 
are actively involved in setting goals, discussing interests, deciding 
what to evaluate, and reflecting on their accomplishments. 

■ They focus on the learning process as well as outcomes, allowing 
learners to reflect on their progress and make changes in how they 
are using their time and resources. 




■ In addition to the linguistic dimension of language and literacy 
development (for example, vocabulary and grammar), these 
assessments focus on the metacognitive (for example, developing 
learning strategies) and affective (for example, increased 
confidence) dimensions. 

■ They involve a variety of procedures, not just a single process 
or tool. 

Alternative assessments provide teachers and learners with valuable 
feedback on learner progress and instructional changes that may need to 
be made. 

Despite the fact that alternative assessment is guided by these principles 
and affords greater flexibility than standardized tests in gathering a more 
complete picture of what learners know and can do, its use as a means for 
accountability has raised serious questions. Without the development of 
guidelines and rigorous procedures for the collection and evaluation of 
evidence of learner performance and without the proper training of staff 
in how to carry out the assessments, alternative assessments do not 
produce the reliable, hard data that funding sources require. To reduce 
the subjectivity associated with such assessments, administration 
procedures and conditions would have to be strictly monitored and a 
minimum of two raters involved in assessing student performance 
(Lumley, 1996). Furthermore, the program-specific nature of the 
assessments, along with the difficulty of aggregating the data across 
programs, makes program comparisons difficult. 
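
When two raters score the same learner performances, their consistency can 
be checked numerically. The sketch below shows one common approach: simple 
percent agreement and Cohen's kappa, which adjusts agreement for what would 
be expected by chance. Neither statistic is prescribed by the sources cited 
here, and the ratings are invented for illustration only. 

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of performances on which both raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own observed rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when only one score is ever used")
    return (observed - expected) / (1 - expected)


# Invented scores (levels 1-4) given by two raters to eight performances.
rater_a = [1, 2, 2, 3, 3, 3, 4, 4]
rater_b = [1, 2, 3, 3, 3, 4, 4, 4]

print(f"percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")
```

Kappa is lower than raw agreement because some matches would occur by 
chance alone. This is one reason rater training and monitored procedures 
matter when alternative assessments are used for accountability. 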



EDUCATION REFORM INITIATIVES THAT INFORM ADULT 
ESOL ASSESSMENT 

The issues of what and how to assess in language and literacy 
development are not unique to adult ESOL education. Assessment reform 
movements in K-12 education and language education abroad face 
similar challenges, although in some respects they are ahead of adult 
education in the United States, and the field can learn from their 
experiences. The next two sections briefly review some initiatives in 
K-12 education and abroad. 

K-12 Education 

In K-12 education in the United States, the assessment reform of the 
early 1990s was tied to the K-12 standards movement, which resulted 
from the passage of Goals 2000 legislation and the Improving America's 
Schools Act of 1994. A number of national, state, local, and professional 
groups have been developing content standards (what students should 
know and what schools should teach and assess) in a broad array of 
subjects, including art, foreign language, geography, history, language 
arts, mathematics, science, and social studies. In some content areas, 
performance standards (what students must know and be able to do to 
demonstrate proficiency in the content area) are also being developed. 
Performance standards can be used to guide the development of 
assessment tools and processes. A third type of standard, referred to as 
opportunity to learn, defines what resources (for example, staffing and 
materials) are needed to make sure students will be able to meet the 
content and performance standards (August & Hakuta, 1997). 

Early in the standards movement, Teachers of English to Speakers of 
Other Languages (TESOL) formed a task force to ensure educational 
equity and opportunity for students learning English as a second or 
additional language. The task force became concerned that many of the 
K-12 academic content standards did not take into account the important 
role of language in academic achievement. The task force developed 
ESOL standards for pre-K-12 students that specify the language 
competencies English-language learners need to become fully proficient 
in English, both socially and academically (TESOL, 1997). 

The framework for the standards is based on three overarching goals: 
developing competence in English in social language, academic 
language, and sociocultural knowledge. These goals are supported by 
nine standards, organized by grade-level clusters (pre-K–3, 4–8, and 
9–12), that can be used to guide curriculum development at the state or 
local level. Descriptors of representative behaviors that demonstrate a 
standard is being met, along with sample progress indicators, are 
included in the document to assist in the development of curricular 
objectives and benchmarks for reporting student performance. Examples 
are included of how the sample progress indicators can be used across 
grade-level clusters to account for different English-language proficiency 
levels (beginning, intermediate, advanced, and limited formal schooling). 

Exhibit 6.7 shows how tasks can be constructed across proficiency levels 
to demonstrate the same standard. It can be used to guide the 
development of assessment tasks to monitor the progress not only of 
K-12 English-language learners but also of adults. The goal in the 
example, "To use English to achieve academically in all content areas," 
is admittedly geared toward academics in pre-K-12 education. But the 
standard, "to obtain, process, construct, and provide subject matter 
information in spoken and written form," describes uses of language that 
adult learners identified as important to their own literacy development in 
the adult-focused Equipped for the Future Project (Stein, 1997) and that 
employers defined as important in a high-performing workforce 
(Secretary's Commission on Achieving Necessary Skills, 1993). The 
TESOL pre-K-12 standards document, along with its accompanying 
Scenarios for ESL Standards-Based Assessment (TESOL, 1999), can 
serve as a model for creating performance assessment tasks that probe 
the functioning proficiency level of the adult learner. 

Adult Language Education Abroad 

Developments in assessing language learners in Australia, Canada, and 
Europe can inform efforts in adult education in the United States. 
Stakeholders in each country have expressed the need for a common way 
to measure and describe learner outcomes across a wide variety of 
programs. In Australia and Canada, there are strong links between 
immigration policy and economic and labor policy. As Europe moves 
toward closer ties under the European Union, educators face the daunting 
task of linking assessments in various member languages to a common 
framework in order to compare language certification across member 
countries. 

AUSTRALIA. Australia traditionally has welcomed immigrants. Before 
World War II, most immigrants to Australia came from Ireland and the 
United Kingdom. After the war, Australia's immigration policies looked 
to non-English-speaking countries to meet the demands for new settlers. 
As a consequence, the Adult Migrant Education (now English) Program 
(AMEP) was established to provide these new settlers with English- 
language instruction and settlement information. 

The AMEP has become the largest government-funded English-language 
training program in the world. At first a centralized curriculum was 
designed by the government, but in the 1970s curriculum development 
was decentralized so that teachers at individual programs became the 
primary developers of curriculum, learner placement, needs assessment, 
and procedures for monitoring progress (Burns, 1994). In the 1990s, a 
collaborative initiative was undertaken by the Australian government, 
industry, and unions to develop a curriculum framework linking industry 
needs with training in workplace competence (Hager, 1996). This 
occurred in response to changing world economic conditions, a need for 
increased continuity between language-training programs and vocational 
and job-training programs, and a demand for measurable outcomes. The 
result has been the development of a number of competency-based 
curriculum documents that are nationally or state accredited. Among the 
most well known is the Certificate in Spoken and Written English 
(CSWE), developed by the New South Wales Adult Migrant English 
Service and the National Center for English Language Training and 
Research (NCELTR). 

The aim of the CSWE is to help adult English-language learners 
develop the language and literacy skills that will enable them to 
participate in further education or training, seek and maintain 
employment, and become contributing members of the community. 




Competencies that describe what learners can do at the end of study are 
identified at three stages of proficiency (beginning, postbeginning, and 
intermediate), based on the Australian Second Language Proficiency 
Ratings (ASLPR). Within each stage, learners can be grouped by a slow, 
standard, or fast learning pace to accommodate differences in 
educational experiences and native language literacy abilities. 

The CSWE document is considered a curriculum framework, rather than a 
syllabus, from which individual programs can create their own courses of 
study. However, the document specifies the explicit criteria under which 
the competencies must be assessed. Within these criteria, the 
competencies may be assessed through teacher observations, interviews, 
role plays, learner self-assessments, and other means, following the 
guidelines. Possible exit points occur at the end of each stage, but the 
certificate is issued only after Stage 3 competencies have been achieved. 
It takes a learner approximately 250 to 300 hours to pass through a stage. 
(For a detailed description of the CSWE, see New South Wales Adult 
Migrant English Service, 1993.) 

Initially teachers were concerned about the appropriateness of formal 
assessment with their learners, the time it takes to assess performance, 
and validity and reliability issues among teachers and across programs. 
After an initial adjustment period, teachers found that the new system 
enabled them to focus their teaching in a way that provided clearer 
direction and more explicit feedback to learners on progress (Burns & 
Hood, 1995). Issues of validity and reliability continue to concern 
Australian adult educators as proficiency assessments based on rating 
scales are increasingly used for accountability purposes (Manidis & 
Prescott, 1996; Brindley, 1999). 

Another concern is the apparent inadequacy of the system for assessing 
the progress of learners who have limited experience with formal 
learning, low literacy levels in both their native language and English, 
and difficulty adjusting to new cultural expectations (Jackson, 1994). 
These same concerns for the low-level learner have surfaced in the 
United States as well. These learners tend to require longer periods of 
instruction and make smaller language gains than more literate learners. 
The gains they make tend to be nonlinguistic outcomes in the affective, 
cognitive, and sociocultural domains. 

A research project undertaken by the NCELTR identified eight major 
categories of nonlinguistic outcomes: confidence; social, psychological, 
and emotional support in one's living and learning environments; 
knowledge of social institutions; cultural awareness; learning skills; goal 
clarification; motivation; and access and entry into further study, 
employment, and community life (Jackson, 1994). Despite the 
incorporation of nonlanguage outcomes in the knowledge and learning 
competencies of the CSWE (New South Wales AMES, 1993), teachers 
lack adequate tools to assess and document these outcomes. Research has 
shown that these outcomes reflect characteristics of good language 
learners and appropriate language teaching methodologies (Jackson, 
1994). They also reflect characteristics and skills favored in the 
workplace (Hager, 1996; Secretary's Commission on Achieving 
Necessary Skills, 1993). They should be considered in every assessment 
system, and teachers need help in finding ways to assess and record 
them. 

CANADA. In response to the field's expressed need for a common set of 
standards to measure and describe language development of learners in 
Canadian adult ESOL programs, Employment and Immigration Canada 
(now Citizenship and Immigration Canada [CIC]) developed national 
English-language benchmarks with the assistance of a working group of 
ESOL learners, teachers, administrators, immigrant service providers, 
and government officials (CIC, 1996; Pierce & Stewart, 1997). 

The Canadian Language Benchmarks (CLB) project, launched in spring 
1996, is a task-based descriptive scale of ESOL proficiency. It identifies 
twelve benchmarks across three stages of proficiency (basic, 
intermediate, and advanced) in three skill areas: speaking/listening, 
reading, and writing. Each benchmark describes a person's ability to use 
English to accomplish a task and includes information on the abilities of 
the person (for example, "Can copy information, describe personal 
situations"), performance conditions (for example, "Copies words and 
numbers clearly and accurately"), situational conditions (for example, 
time limit, length of text, number of mistakes allowed), background 
knowledge helpful to completing a task (for example, cultural 
expectations, availability of community services, note-taking 
conventions), and sample tasks. As compared with the American SPL 
descriptors, the level of detail in the Canadian benchmarks makes the 
document more useful for creating assessment tasks. The CLB does not 
purport to be a proficiency test or a syllabus, but curriculum writers, 
materials developers, and practitioners can use it as a guide for syllabus 
development, instruction, and monitoring of learner progress. (For a 
detailed description of the benchmarks see CIC, 1996.) 

Concurrent with the field testing of the draft CLB document, CIC 
contracted with the Peel Board of Education in Ontario to develop the 
Canadian Language Benchmarks Assessment (CLBA) to assist in placing 
learners in programs and assessing learner progress. It was to be a task- 
based assessment instrument that addressed Benchmarks 1 through 8 
(Stages 1 and 2) in each of the three language skill areas. The developers 
faced many challenges: the instrument had to consist of tasks 
representative of those identified in the CLB document, reconcile the 
assessment of each separate skill area with the holistic approach implicit 
in task-based assessment, balance cultural bias and authenticity of task, 
and be user friendly for both the test giver and the test taker across a 
wide range of program types and settings (Pierce & Stewart, 1997). 
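
The relationship between benchmarks, stages, and CLBA coverage can be 
summarized in a small lookup. This is a sketch under a stated assumption: 
the text gives twelve benchmarks across three stages and places Benchmarks 
1 through 8 in Stages 1 and 2, so the code assumes four consecutive 
benchmarks per stage. The function names are hypothetical and do not come 
from the CLB or CLBA documents. 

```python
STAGES = ("basic", "intermediate", "advanced")
TOTAL_BENCHMARKS = 12
# Assumption: benchmarks are divided evenly, four per stage, in order.
BENCHMARKS_PER_STAGE = TOTAL_BENCHMARKS // len(STAGES)
# The CLBA was designed to address Benchmarks 1 through 8.
CLBA_HIGHEST_BENCHMARK = 8


def stage_for_benchmark(benchmark):
    """Return the stage name for a benchmark number from 1 to 12."""
    if not 1 <= benchmark <= TOTAL_BENCHMARKS:
        raise ValueError(f"benchmark must be 1-{TOTAL_BENCHMARKS}")
    return STAGES[(benchmark - 1) // BENCHMARKS_PER_STAGE]


def covered_by_clba(benchmark):
    """Return True if the benchmark falls within the range the CLBA addresses."""
    return 1 <= benchmark <= CLBA_HIGHEST_BENCHMARK


for b in (1, 4, 5, 8, 9, 12):
    print(b, stage_for_benchmark(b), covered_by_clba(b))
```

Under this assumption, learners working at the advanced stage fall outside 
the range the CLBA was designed to assess. 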

The resulting CLBA consists of three separate instruments, one for each 
of the skill areas. The listening/speaking assessment is administered one- 
on-one and is scored by the interviewer as it is being administered. It 
takes ten to thirty minutes to complete. The reading and writing 
assessments can be administered individually or in a group setting and 
take at least another one to two hours to administer. There are parallel 
forms of the reading and writing assessments: two to be used for 
placement and two for assessing outcomes. 

The developers consider the CLBA a work in progress. Assessors are 
being trained, and an interrater reliability study is underway. They 
caution, however, that it remains a low-stakes instrument that would need 
to have its validity and reliability enhanced before it could be used for a 
high-stakes assessment such as job entry, academic opportunity, or 
immigration (Pierce & Stewart, 1997). 

EUROPE. The assessment reform effort in Europe is perhaps more 
complex than the efforts in Australia, Canada, and the United States 
because the Europeans are attempting to correlate assessments in many 
languages to a common set of standards. Two projects in particular, the 
Council of Europe's Common European Framework and the Association 
of Language Testers in Europe's (ALTE) Framework Project, are worth 
noting. The first establishes a comprehensive framework for the 
description of language proficiency and its relationship to content. The 
second then applies this framework to the various examinations and 
certificates offered in member countries to promote the recognition of 
language certification across Europe. 

Common European Framework. The Council of Europe assists its forty- 
four member states in encouraging all citizens to learn nonnative 
languages to promote mutual understanding, personal mobility, and 
access to information in a multilingual, multicultural Europe (van Ek & 
Trim, 1996). The second draft of Modern Languages: Learning, 
Teaching, Assessment. A Common European Framework of Reference 
(Council for Cultural Cooperation Education Committee, 1996) 
represents the council's latest effort in a collaborative process that began 
in 1971. The document identifies four contexts for language use 
(personal, public, occupational, and educational) and specifies ranges of 
language knowledge and skills. Language programs can use the 
document to develop curricula, select instructional approaches and 
techniques, and establish assessment procedures. It includes information 
on the purposes of communication (for example, completing insurance 
forms, taking public transportation, reporting an accident); 
communication activities (productive, receptive, interactive) and 
processes (for example, plan, organize, execute); texts (for example, 
spoken: public speeches, news broadcasts; written: instructions, letters, 
signs); learning strategies; the processes of language learning and 
teaching; establishment of levels and scales; and assessment purposes 
and types. It is not prescriptive in the sense of telling practitioners what 
to do or how to do it. Rather it provides a range of elements for programs 
to choose from and establishes principles for language teaching, learning, 
and assessment to facilitate program design. It thus provides a common 
basis for discussing these choices among programs. 

ALTE Framework Project. The ALTE, established in 1989, is an 
association of European institutions that offers learners language training 
and certification in the language of the institution's country or region. 
Fifteen languages are represented among the association's eighteen 
members. As the European workforce becomes more mobile within the 
European Union, both employees and employers need to know what 
language qualifications mean across countries. To meet this need, ALTE 
members are establishing common levels of proficiency and common 
standards for the language testing process (ALTE, 1998). They have 
drawn from previous work of the Council of Europe that specified the 
descriptions of language ability (van Ek & Trim, 1996) as well as from 
their collaboration with the council on the Common European 
Framework project. 

The ALTE framework identifies five main levels of proficiency, each 
defined by statements of what a user can do at that level across the 
productive (speaking and writing) and receptive (reading and listening) 
skill areas. Examinations given by ALTE members have been charted 
along this continuum by comparing the content and the demands each 
examination makes on the examinee (ALTE, 1994); this allows for 
comparisons of skill level attainment across language tests. For example, 
a learner of French as a foreign language achieving the Diplôme de 
Langue Française issued by the Alliance Française would have the same 
proficiency in French that a learner of Italian would have in Italian upon 
achieving the Certificato di Conoscenza della Lingua Italiana, Livello 3. 
The "can do" statements for each level are currently being validated so 
that they can be used to describe what a language test score actually 
means in the real world (for example, "Can offer help to a client or 
customer: 'I'll give you our new catalogue'") (Jones, 1999). 

Works in Progress. Both the Council of Europe and the ALTE consider 
their frameworks to be works in progress, to be used, commented on, and 
further developed in response to experience gained from using the 
documents, to developments in research in language acquisition and 
learning, and to new needs that may arise. 

Lessons from These Assessment Initiatives 

Assessment initiatives in adult education in the United States can apply 
the lessons learned from K-12 efforts and adult-language education 
efforts abroad. The primary lesson to be drawn from these efforts is that 
in the overarching structure, assessment is but one component in a larger 
instructional system. The K-12 system is well established in the United 
States. Australia's Adult Migrant English Program was strongly backed 
by government funding initiatives and linked to a major government- 
funded research institute, the National Center for English Language 
Training and Research (NCELTR), which works with practitioners to 
produce research and provide training that reflect the practical classroom 
issues language teachers face (for example, assessing oral proficiency 
[Manidis & Prescott, 1996]). The Common European Framework project 
is producing a comprehensive compendium of elements of a language 
instructional system so that programs can base their instructional designs 
on solid knowledge about language acquisition and principles of 
language teaching and assessment. 

These initiatives also show that assessment reform does not begin with 
assessment procedures but with content standards-what learners need to 
know and be able to do to function successfully in the communities in 
which they live. Once the content is identified, curriculum and 
instructional approaches are chosen and performance standards and 
assessment procedures developed. This does not mean that every 
program that uses the pre-K-12 TESOL standards document, the 
Canadian Language Benchmarks, or the Common European Framework 
will be teaching and assessing the same things in the same way. 

However, these documents enable individual programs to select elements 
that match program and learner purposes and goals, while at the same 
time providing a common frame of reference among the programs. 

Although the requirements of the Workforce Investment Act and the 
development of the National Reporting System represent strong moves 
toward a national accountability system in the United States, by and 
large, there is no national comprehensive learning system for adult 
education (Merrifield, 1998) with which accountability and assessment 
fit coherently. What to assess, how to assess, how to report the data, what 
they mean, and how to use them are questions that still need to be agreed 
on. This does not mean that traces of such a system do not exist. In fact, 
there are initiatives at local, state, and federal levels that indicate the field 
is lurching toward building a system. For example, many programs have 
developed instructional designs with articulated mission statements and 
goals, approaches to curriculum and instruction, and procedures for 
assessment. The way these components are defined varies, depending on 
the type of program and the goals of the program and its learners. What 
is lacking is a common framework, as in the European model, to enable 
programs to compare what they are doing across their varying types and 
missions. 

Several states, including Massachusetts and California, have elements of 
statewide systems in place. California was the first state to identify 
standards for adult ESOL. The English-as-a-Second-Language Model 
Standards for Adult ESL (California Department of Education, 1992) 
provides standards for program design, curriculum, instruction, and 
student evaluation. The document also identifies proficiency levels and 
describes course content and sample lessons for each level. The 
Massachusetts effort to establish standards is part of a larger effort across 
the state to implement standards in all of adult education. The first step in 
the process is to establish a curriculum framework that identifies what 
learners need to know and be able to do. From this framework, 
assessment and implementation phases will follow (Massachusetts 
Department of Education, 1998). 

On the national level, the National Institute for Literacy's Equipped for 
the Future (EFF) project is attempting to answer the question, "What is it 
that adults need to know and be able to do in order to be literate, to 
compete in the global economy, and to exercise the rights and 
responsibilities of citizenship?" (Stein, 1997, p. 2). The goal is to 
establish content standards that articulate what adults need to know and 
be able to do to be effective family members, community members, and 
workers. These standards will cut across the four purposes of literacy 
identified in phase 1 of the project: gain access to information, give voice 
to ideas, act independently, and build a bridge to the future by learning 
how to learn. ESOL programs in several states have been involved in the 
development and pilot testing of the content standards. These standards 
can provide the basis for establishing performance standards from which 
programs and states can develop assessment systems. In fact, the EFF 
team is already beginning to conceptualize what an assessment system 
for the standards might look like (National Institute for Literacy, 1999). 

Another project at the national level is the "What Works" Study for Adult 
ESL Students Evaluation, a five-year study sponsored by the U.S. 
Department of Education and being conducted by the Pelavin Research 
Center and Aguirre International. Its purpose is to evaluate the 
effectiveness of instructional approaches for adult English-language 
learners with limited literacy skills and then make recommendations to 
the field. A small part of the study will examine the types of assessment 
and outcome measures that are being used and identify those that are
most appropriate for these learners.



Finally, TESOL has appointed a task force to develop adult education 
program standards and sample performance measures that will address 
program goals, structure, implementation, curriculum, instruction, and 
assessment (Bitterlin, 1997). This document will be a useful guide for 
program staff and policymakers in establishing performance measures to 
ensure high-quality educational services for adult learners of English. 



RECOMMENDATIONS FOR POLICY, PRACTICE, AND RESEARCH

The field of adult education in the United States is in the midst of 
implementing accountability systems, as is the field of K-12 education.
However, in adult education this is being done without the benefit of the
infrastructure of the K-12 system. Programs and states need support
from both researchers and those providing funds to develop 
accountability systems that are part of a larger adult education 
instructional system, effectively capture what learners know and can do, 
and can be efficiently and realistically carried out at the program level. 
The following recommendations focus on the policies, practice, and 
research needed to make this possible. 

Policy 

The Workforce Investment Act (with its emphasis on performance 
measures) and the National Reporting System (with its emphasis on 
functioning levels) are driving the types of accountability systems that 
states are formulating. The NRS gives only limited examples of test
benchmarks (ranges on CASAS tests and SPLs) to assess learner
functioning levels. The SPLs are descriptors of learner abilities, not a
test. The SPL of an individual learner can be determined either by
matching the local program level to which the learner is assigned with
the corresponding SPL, or by assigning the SPL according to teacher
judgment and verifying it with the BEST test (U.S. HHS, 1985). Also,
although not as detailed as the Canadian Language Benchmarks, the 
SPLs can guide the development of informal performance-based 
assessments (Grognet, 1998). However, it is doubtful that these two 
benchmarks (CASAS ranges and SPLs) are adequate for assessing the 
functioning level gains of learners for accountability purposes in all types 
of adult ESOL programs. The BEST test or formal performance-based 
assessments (valid, reliable, consistently administered, and perhaps using 
multiple raters) correlated to the SPLs would need to be administered to 
ensure that a standardized assessment procedure was used to assess 
learner gains using the SPLs. Furthermore, the assumptions about
curricular content and instructional approaches that underlie the CASAS,
and to some extent the BEST, are not necessarily the assumptions that
undergird all adult ESOL programs. In this chapter we have also pointed
out the dangers of assuming that a higher test score translates into 
increased language proficiency when the tests contain too few items at 
any single proficiency level or assess too few contexts to ensure an 
accurate proficiency level assessment. 

First, agreement needs to be reached as to what constitutes literacy 
development among adult ESOL learners and how it can be measured for 
accountability purposes as learners progress. Assessment measures 
should reflect one's view of literacy and be comprehensive enough to 
show the full range of learner achievement. If the measures are to be 
made into tests, the resources are needed to improve the instruments now 
being used. The SPLs should be revised and revalidated in view of 
increased knowledge of language development and changing practices 
since they were first formulated in the 1980s (Grognet, 1998). Sample 
tasks for each proficiency level, such as those given in the TESOL 
standards document and the CLB, need to accompany the descriptors. If 
new standardized instruments are to be developed, they should be able to 
accommodate learners at every stage of language and literacy 
development. If alternative measures are acceptable, programs will need 
clear guidelines on how to select or develop those measures that are most 
appropriate for their learners and meet the legislative requirements for a 
standardized and rigorous process. 

In addition, consideration must be given to the long time adults need to 
become fluent in another language. SPL studies indicate that a low-level 
learner (SPL 1) needs between 110 and 235 hours to advance to the next
proficiency level, depending on the characteristics of the program (for 
example, intensity of instruction and class size) and the learner (for 
example, educational experience, health, and motivation) (U.S. HHS, 
1985). Australian studies support the need for many instructional hours, 
especially for learners with low literacy skills, to show gains in 
proficiency level. The National Evaluation of Adult Education Programs 
(NEAEP) study, conducted between 1990 and 1994, reported that the 
median amount of time that ESOL learners stay in programs is 113 hours
(Fitzgerald, 1995). Furthermore, the initial gains that low-level learners 
make tend to be nonlinguistic. As mentioned earlier, these are the skills 
and knowledge that may make it possible to continue learning and may 
be important in the workplace, but they do not show up on existing tests. 
Any test or proficiency scale needs to describe sufficiently the 
benchmarks on the way to achieving the next level, so that even learners 
with limited literacy skills or little time available to devote to formal 
instruction can show progress. 
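
The arithmetic behind this concern can be made explicit with a small
sketch. It uses only the figures quoted above; the function name is
hypothetical, and the comparison is a rough illustration rather than a
finding of either study, since the NEAEP median covers ESOL learners at
all levels while the SPL hour range applies to SPL 1 learners.

```python
# Figures quoted in the paragraph above.
HOURS_TO_ADVANCE_MIN = 110   # SPL 1 learner, lower bound (U.S. HHS, 1985)
HOURS_TO_ADVANCE_MAX = 235   # SPL 1 learner, upper bound (U.S. HHS, 1985)
MEDIAN_HOURS_ATTENDED = 113  # median ESOL learner stay (Fitzgerald, 1995)


def share_of_level(hours_attended: float, hours_needed: float) -> float:
    """Return the fraction of one level's required hours that were attended.

    Hypothetical helper for illustration; it assumes progress is
    proportional to hours, which the studies cited do not claim.
    """
    return hours_attended / hours_needed


best_case = share_of_level(MEDIAN_HOURS_ATTENDED, HOURS_TO_ADVANCE_MIN)
worst_case = share_of_level(MEDIAN_HOURS_ATTENDED, HOURS_TO_ADVANCE_MAX)

print(f"Best case:  {best_case:.0%} of the hours needed for one level")
print(f"Worst case: {worst_case:.0%} of the hours needed for one level")
```

Under these assumptions, the median learner attends just enough hours
to complete one level in the most favorable case and about half the
hours needed in the least favorable case, which is why intermediate
benchmarks within a level matter.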

Programs need additional resources to implement assessment
requirements, whether in the development of rigorous performance
assessments, teacher training on using the assessment tools and
procedures, or the ability to compensate teachers for time to assess 
learners and document results. Policymakers at the federal, state, and 
local levels must work with program administrators to determine how 
resources can be allocated to provide this support. 

The first few years of WIA implementation will yield rich data for 
analysis of what is working, for whom, and what adjustments may have 
to be made to make the system work for everyone. The development of 
databases about what works best for various types of learners can 
identify elements of program design that will meet the needs of each 
stakeholder. Program staff need this information in clear and 
understandable language and in a timely fashion if they are to make
well-informed decisions. Researchers need this information to guide their
research and development efforts. 

The development of an accountability system should be just one 
component in an educational system that also encompasses guidelines for 
curricular content and instructional practice. A comprehensive system 
such as the Common European Framework presents a model for 
policymakers, program administrators, practitioners, and researchers to 
consult as results from EFF, the "What Works" Study, the NRS, and 
various research studies are reported. 

Practice 

Practitioners need to be aware of efforts such as EFF, the "What Works" 
Study, and TESOL's Task Force on Program Standards. EFF can provide
program staff with information for evaluating curriculum content, 
thereby helping to increase their awareness of what counts as success to 
learners across a spectrum of program types and, therefore, what should 
be taught and assessed. The "What Works" Study can shed light on best 
practices for learners with limited literacy skills in both instruction and 
assessment. TESOL's program standards can be used to guide the
development of assessment procedures within adult education programs. 
These three efforts can work together to provide the types of programs 
that meet the needs of each stakeholder. 

As practitioners consider what counts as progress and how it can best be 
measured for the learners they serve, they must be able to present and 
support their decisions before administrators and funding agencies and 
organizations. Selection of assessment measures should be made with 
solid knowledge about the proficiencies of learners, the educational 
context of the program, and the adequacy of the assessment measures 
being considered to carry out the purpose of the assessment. 



Research 




To inform both policy and practice, research efforts should address issues 
that the field itself has identified as critical in the area of assessment and 
outcomes. These issues are compiled in the Research Agenda for Adult 
ESL (CAL, 1998, p. 11), in which the following questions are raised: 

■ What immediate and long-term impact can be expected from the 
various types of adult ESOL programs? What impact does learner 
participation in such programs have on the learners and their 
communities? 

■ How can adult ESOL programs best capture what learners know 
and what they have learned? 

■ What is the cost in time, staffing, and funds to assess and 
document learning outcomes effectively? 

■ How can each of the stakeholders in a program participate in 
determining what counts as progress? 

■ How do measures of program impact, such as an increase in 
reading to one's children or a job promotion, correlate with 
increases in English-language proficiency? 

■ How might a national proficiency scale facilitate the reporting
of learner progress and program impact? How effective is the NRS
scale in reporting learner gains?

■ Which assessment instruments can reliably document changes in 
learner performance at what levels? Can these instruments serve all 
types of adult ESOL programs? 

■ What changes in program design and staff development are needed 
to ensure that current and new assessment tools are reliably used? 

■ How can technology facilitate the implementation of a 
system for documenting learner outcomes and program 
impact? 

■ How do local, state, and national policies affect assessment 
tools and practices, and what policies need to be created? 



CONCLUSION 

Whatever the decisions that will be made in the near future about 
assessment in adult ESOL, it is clear that policymakers, practitioners, and 
researchers must engage in a collaborative effort to produce substantive, 
purposeful, and effective change in the field. The U.S. Department of 
Education's (ED) Office of Vocational and Adult Education continues to 
make efforts to listen to stakeholders through such venues as the biannual 
meetings of the State Directors of Adult Education and the studies and 
the clearinghouse it funds. The ED also sponsors a National Forum on 
Adult Education and Literacy, which in 1997 brought in learners from 
across the United States and, in 1998, teachers, to discuss adult education 
issues. However, more substantive efforts are needed to create an
infrastructure that can sustain a comprehensive adult education system.
At local, state, and national levels, representatives from each stakeholder 
group (policymakers, practitioners, researchers, and learners) should 
develop a plan of action that supports the implementation of the new 
legislative imperatives and program designs that effectively and 
efficiently serve the needs of the community and the learners. 

Appendix: Ordering Information for Selected Standardized Tests 

ABLE (Adult Basic Learning Examination) 

Psychological Corporation 
Order Service Center 
P.O. Box 839954 
San Antonio, TX 78283 
(800) 211-8378 

A-LAS (Adult Language Assessment Scales) 

McGraw-Hill
1221 Avenue of the Americas
New York, NY 10020 
(800) 624-7294 
http://www.mhcollege.com 

BEST (Basic English Skills Test) 

Center for Applied Linguistics 
4646 Fortieth Street, N.W.
Washington, DC 20016-1859
(202) 362-0700 
http://www.cal.org 

CASAS (Comprehensive Adult Student Assessment System) 

8910 Clairemont Mesa Boulevard 
San Diego, CA 92123 
(619) 292-2900 
http://www.casas.org 

NYS Place Test (New York State Placement Test for ESOL 
Adult Students) 

City School District of Albany 
Albany Educational TV 
27 Western Avenue 
Albany, NY 12203 
(518) 462-7292, ext. 30 

TABE (Test of Adult Basic Education) 

CTB/McGraw-Hill
20 Ryan Ranch
Monterey, CA 93940 
(800) 538-9547 
http://www.ctb.com 